Fork Sync: Update from parent repository #1

Open
wants to merge 1,222 commits into
base: main
from

Conversation

github-actions[bot]

No description provided.

pixelflinger and others added 30 commits July 1, 2024 11:25
the failure could happen if the shader didn't have any #extension
strings, which was likely to happen on release builds (e.g. on 
emulator).
- in case of failure we were munmap'ing the wrong size for the
  guard page (in practice this never happened)
- the post-condition check was incorrect; it checked for nullptr instead
  of MAP_FAILED. this also never happened in practice.

Also made a couple of small improvements:

- in case the special circular buffer mapping fails, log a message
  as warning instead of debug.

- immediately memset (i.e. populate) the pages for the circular buffer
  since they will all be accessed rather quickly.
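
The post-condition fix described above can be illustrated with a minimal sketch. The helper names `mapAndPopulate` and `unmap` are hypothetical and are not taken from the Filament sources; the sketch only shows the general POSIX pattern of comparing against MAP_FAILED and unmapping with the same size that was mapped.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: maps `size` bytes of anonymous memory and pre-populates
// the pages. Returns nullptr on failure so callers can test the result normally.
inline void* mapAndPopulate(std::size_t size) {
    assert(size > 0);
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    // mmap() reports failure with MAP_FAILED, i.e. (void*)-1, not nullptr.
    if (p == MAP_FAILED) {
        return nullptr;
    }
    // Touch every page right away, since they will all be accessed soon.
    std::memset(p, 0, size);
    return p;
}

// Unmaps a region created by mapAndPopulate(); `size` must match the mapping.
inline bool unmap(void* p, std::size_t size) {
    return munmap(p, size) == 0;
}
```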
renderStandaloneView must be called outside of beginFrame() /
endFrame(). This extra check ensures this behavior.
- we use a circular buffer for the frame history so that 
  we don't have to copy the data when inserting a new entry.
  This also allows us to keep a reference to an entry, which
  doesn't get invalidated when an entry is added/removed.

- we now store the gpu frame time in the correct slot (instead of
  always the latest). It didn't matter before because the API wasn't
  public and we only needed some recent frame time.

- a new public API now returns the frame history, which now contains 
  more data; in particular the main and backend thread's begin/end
  frame time.


BUGS=[321110544]
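
As a rough illustration of the circular-buffer idea in this commit, the sketch below uses a hypothetical `RingHistory` class (not the actual FrameInfoManager code). Inserting a new entry only moves the head index, so existing elements are never copied and a reference to an entry keeps pointing at the same slot. In this simplified version a slot is reused once the buffer wraps around.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity history; names are illustrative only.
template<typename T, std::size_t N>
class RingHistory {
public:
    // Inserts at the front. When full, the oldest slot is overwritten.
    // No existing element is moved, so earlier references stay put.
    T& pushFront(T const& value) {
        mHead = (mHead + N - 1) % N;
        mData[mHead] = value;
        if (mSize < N) ++mSize;
        return mData[mHead];
    }

    // Index 0 is the most recent entry.
    T& operator[](std::size_t i) {
        assert(i < mSize);
        return mData[(mHead + i) % N];
    }

    std::size_t size() const { return mSize; }

private:
    std::array<T, N> mData{};
    std::size_t mHead = 0;
    std::size_t mSize = 0;
};
```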
- reduce the number of calls to notify_one() and notify_all().
  notify_one() is now only called when running a new job, and
  notify_all() only when a job finishes.

- don't hold the condition lock while calling notify_*(). It is not
  strictly needed, and because notify_*() can be very slow, holding
  the lock can cause a lot of contention, blocking the whole
  jobsystem thread pool.

- add a new version of run() that takes an opaque thread id that can
  be retrieved from a job's execute function; this is especially
  intended to be used by parallel_for(); it's just a more efficient
  version of run() that avoids a hashmap lookup.


Overall these changes yield a significant performance boost:
- running + waiting a job: +200%
- running many jobs: +150%
- running many jobs in parallel: +50%
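
The "don't hold the condition lock while calling notify_*()" point above can be sketched with standard C++ primitives. The `WorkQueue` class below is a hypothetical stand-in, not Filament's JobSystem; it only shows the lock being released before the notification is issued.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>

// Hypothetical queue used for illustration only.
class WorkQueue {
public:
    void push(int job) {
        {
            std::lock_guard<std::mutex> lock(mLock);
            mJobs.push_back(job);
        } // the lock is released here, before notifying
        // A woken thread can now take mLock without contending with us.
        mCondition.notify_one();
    }

    // Blocks until a job is available, then returns it.
    int pop() {
        std::unique_lock<std::mutex> lock(mLock);
        mCondition.wait(lock, [this] { return !mJobs.empty(); });
        int job = mJobs.front();
        mJobs.pop_front();
        return job;
    }

private:
    std::mutex mLock;
    std::condition_variable mCondition;
    std::deque<int> mJobs;
};
```

This is safe because the shared state is still modified under the lock; only the notification itself happens outside it.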
The batch size was calculated incorrectly, which in theory could lead
to command buffer overflows (albeit unlikely).
* improve parallel_for a bit

We get about a 40% performance increase. The gain comes from not having
to copy the JobData structure each time we create a job: with the
new emplaceJob() method, we can create the structure directly in
its destination.

* avoid calling wakeAll() when possible

wakeAll() is very expensive and not always needed when a job finishes
because there may not be anyone waiting on that job.

We now maintain a waiter count per job, and use that to determine if
we need to notify or not.

And now that the JobSystem overhead is lower, we can decrease the size
of the jobs, which improves the load balancing.

* mActiveJobs fixes

some comments claimed mActiveJobs needed to be modified before or after
accessing the WorkQueue; this couldn't be correct because there was no
guaranteed global ordering with the workQueue.
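
The per-job waiter count described above might look roughly like the following. `Job` and `Notifier` are hypothetical types and the real JobSystem almost certainly differs (for instance in memory ordering and locking); the sketch only shows how a waiter count lets the finishing thread skip notify_all() when nobody is waiting.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>

// Hypothetical job record; Filament's actual Job layout is not shown in this log.
struct Job {
    std::atomic<bool> done{false};
    std::atomic<int> waiters{0};
};

class Notifier {
public:
    // Blocks until the job is marked done.
    void wait(Job& job) {
        std::unique_lock<std::mutex> lock(mLock);
        job.waiters.fetch_add(1);
        mCondition.wait(lock, [&job] { return job.done.load(); });
        job.waiters.fetch_sub(1);
    }

    // Marks the job done. Returns true if a notify_all() was actually issued.
    bool finish(Job& job) {
        bool hasWaiters;
        {
            std::lock_guard<std::mutex> lock(mLock);
            job.done.store(true);
            hasWaiters = job.waiters.load() > 0;
        }
        // Skip the expensive wake-up when nobody is waiting on this job.
        if (hasWaiters) mCondition.notify_all();
        return hasWaiters;
    }

private:
    std::mutex mLock;
    std::condition_variable mCondition;
};
```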
The ResourceAllocator used to be global and owned by the Engine. This
was causing some issues when using several Renderers because each
one could cause the eviction of cache data for another.

We now have a ResourceAllocator per Renderer, which makes more sense
because most resources are allocated by the FrameGraph.

We also introduce a ResourceAllocatorDisposer class, which is used
for checking a texture in and out of the cache, and for destroying the
texture when it's checked out. That object is still global.
Add Renderer::skipFrame() which should be called when intentionally
skipping frames, for instance because the screen content hasn't changed,
allowing Filament to perform needed periodic tasks, such as
cache garbage collection and callback dispatches.

We also improve the ResourceAllocator cache eviction policy:
- the cache is aggressively purged when skipping a frame
- we aggressively evict entries older than 3 frames

The default Config is now set to a more aggressive setting.
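
The eviction policy above can be sketched as follows. `FrameCache` and its methods are hypothetical names, and treating "aggressively purged" as "drop everything" is an assumption made for the sake of a small example; the real ResourceAllocator is more involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct CacheEntry {
    uint32_t id;
    uint64_t lastUsedFrame;
};

// Hypothetical cache keyed by the frame in which an entry was last used.
class FrameCache {
public:
    explicit FrameCache(uint64_t maxAge) : mMaxAge(maxAge) {}

    void put(uint32_t id, uint64_t frame) { mEntries.push_back({ id, frame }); }

    // Regular frame: evict entries older than mMaxAge frames.
    void gc(uint64_t currentFrame) {
        auto tooOld = [this, currentFrame](CacheEntry const& e) {
            assert(currentFrame >= e.lastUsedFrame);
            return currentFrame - e.lastUsedFrame > mMaxAge;
        };
        mEntries.erase(std::remove_if(mEntries.begin(), mEntries.end(), tooOld),
                       mEntries.end());
    }

    // Skipped frame: purge the whole cache (assumed policy, see lead-in).
    void onSkipFrame() { mEntries.clear(); }

    std::size_t size() const { return mEntries.size(); }

private:
    uint64_t mMaxAge;
    std::vector<CacheEntry> mEntries;
};
```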
If it can't find a name for a node, it falls back to the config's
defaultNodeName. However, if that is also nullptr, a crash will occur,
so we provide a last-resort hardcoded name in that case.
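
The fallback chain described above amounts to something like the sketch below. The function name `resolveNodeName` and the literal `"node"` are assumptions for illustration; the commit message does not say which hardcoded name is used.

```cpp
// Hypothetical helper: picks the first non-null name in priority order.
inline const char* resolveNodeName(const char* nodeName, const char* defaultNodeName) {
    if (nodeName) return nodeName;
    if (defaultNodeName) return defaultNodeName;
    // Last resort, so callers never receive nullptr. The actual string is an assumption.
    return "node";
}
```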
Adds multiview support for Vulkan. This is done by adding a layerCount to the renderTarget, which is used to determine if multiview is available and being used in the current renderpass.

FIXES=[332425392]

Co-authored-by: Powei Feng <[email protected]>
This fixes a crash introduced by a8ace28

The refactored FrameInfoManager can cause a crash when IBL resource loading
happens because now the getLastFrameInfo() references an invalid value via the
`front` method. Return the default FrameInfo to resolve this.

Also fix a null pointer reference bug for OpenGLTimer::State, which
happens when the renderer for IBLPrefilterContext is destroyed.
The value of mCgltfBuffersLoaded is sometimes retained across the creation of FAssetLoader, which skips loading the buffer in AssetLoaderExtended#createPrimitive and leads to a null pointer crash.
this change shouldn't have any impact on ARM, however, according
to cppreference it's not safe to mix seq_cst with other memory
orders:

"as soon as atomic operations that are not tagged memory_order_seq_cst 
enter the picture, the sequential consistency guarantee for the program 
is lost"
This function attempts to set texture parameters in these two cases, but the
texture is not guaranteed to be bound. Perhaps it once was, but the assumption
broke at some point.
Commit 730bc99 introduced a new
dependency on ResourceAllocator because of the new field
`std::unique_ptr<ResourceAllocator> mResourceAllocator{};` in
details/Renderer.h

This requires cpp files including details/Renderer.h to include
ResourceAllocator.h as well.

This compile issue only happens on the Windows compiler, Visual Studio.
poweifeng and others added 30 commits November 25, 2024 15:45
When releasing array elements, we need to pass 0 as the
last argument if we want to copy local changes back to
the Java array. JNI_ABORT should only be used when we
don't want the copy, i.e. when the array is read-only.

Fixes issue #8278.
The workarounds, which were intended to be used with the native driver
only, were picked up by the WebGL backend when running on Firefox,
probably because of the different GL string.

BUGS=[380425595]
When a viewport with a negative left/bottom was used, an assert would
trigger on debug builds and release builds would get a wrong scissor.

This bug was introduced recently. We now do the offset/clipping math
in 64 bits, which is much simpler, and convert back to Viewport at
the end.

BUGS=[381902791]
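
A minimal sketch of doing the offset/clipping math in 64 bits is shown below. The `Viewport` struct here is a simplified stand-in, and `applyScissor` is a hypothetical function; it assumes the scissor is expressed relative to the viewport origin and that the viewport's own right/top edges fit in a 32-bit signed integer.

```cpp
#include <algorithm>
#include <cstdint>

// Simplified stand-in for a viewport type: signed origin, unsigned size.
struct Viewport {
    int32_t left, bottom;
    uint32_t width, height;
};

// Offsets `scissor` by the viewport origin and clips it to the viewport.
inline Viewport applyScissor(Viewport const& vp, Viewport const& scissor) {
    // Promote to 64 bits so that left + width can neither overflow nor wrap,
    // even with a negative origin and a very large size.
    int64_t const vl = vp.left, vb = vp.bottom;
    int64_t const vr = vl + vp.width, vt = vb + vp.height;

    int64_t l = vl + scissor.left;
    int64_t b = vb + scissor.bottom;
    int64_t r = l + scissor.width;
    int64_t t = b + scissor.height;

    // Clamp each edge into the viewport; an empty intersection yields size 0.
    l = std::min(std::max(l, vl), vr);
    b = std::min(std::max(b, vb), vt);
    r = std::min(std::max(r, l), vr);
    t = std::min(std::max(t, b), vt);

    // Convert back to the 32-bit representation at the very end.
    return { int32_t(l), int32_t(b), uint32_t(r - l), uint32_t(t - b) };
}
```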
- depthClamp is only supported from iOS 11.0
- reset the depthClamp state at the beginning of the renderpass


BUGS=[379729888]
This restores the old behavior with depth variant caching. We pay the
price at engine init time, instead of when the variant is needed the
first time. We can revisit this later.

Note that the default material doesn't have all possible depth variants
(e.g. VSM), but this pre-caches the most popular ones.

BUGS=[381946222]
Because ColorPassDescriptorSet is used for both the color pass and the
picking/ssr/structure passes, the texture it holds can be stale.

In this CL we fix this by resetting the offending textures when preparing
the picking/ssr/structure passes. This has a side effect of recreating
the descriptor-set internally. A better fix would be to use separate
descriptor-sets for these two cases and rely on the texture cache
to avoid recreation from a frame to the next.

BUGS=[376705346]
When using async shader compilation, the compiler thread pool runs on
standard pthreads rather than NSThreads. Therefore, it needs a manual
@autoreleasepool, otherwise ARC objects never get truly released.

BUGS=[383167935]
A ResourceList<> object was leaked when destroying a material until the 
Engine was shut down. This could however grow unbounded if churning 
through Materials.

What was actually leaked was entries in the hashmap linking a material
to its material instance list.

BUGS=[383158161]
- Platform changes for the extensions
- Platform changes for the processing of the AHB
When tangents are not supplied in a material, all "normals" related
public methods become undefined; remove access to them so we get 
compile time errors instead of garbage values in the shader.
The `ShadowCascade` functions for computing splits all attempted to clamp `cascades` to [0,4] (evidently to avoid out-of-bounds access), but used `max`, instead giving a range of [4, x].

For applications using fewer than 4 cascades, this results in incorrect split locations and lower quality (much more obvious seam between cascades).

This change just switches to use `min` to ensure `cascades` is in the range [0,4].
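
The difference between the two clamps can be seen in a small sketch. The function names and `kMaxCascades` are hypothetical; only the `max` versus `min` distinction comes from the description above.

```cpp
#include <algorithm>
#include <cstddef>

constexpr std::size_t kMaxCascades = 4; // hypothetical name for the upper bound

// Buggy variant: max() can never return less than 4, so the range is [4, x].
inline std::size_t clampCascadesBuggy(std::size_t cascades) {
    return std::max(cascades, kMaxCascades);
}

// Fixed variant: min() caps the value at 4, giving the intended range [0, 4].
inline std::size_t clampCascades(std::size_t cascades) {
    return std::min(cascades, kMaxCascades);
}
```

With 2 cascades requested, the buggy variant returns 4 while the fixed one returns 2, which is why applications using fewer than 4 cascades were affected.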
To allow for use-after-free discovery with ref-counting, we keep
a bit to indicate whether the handle is considered destroyed by
Filament. We assert against "cast"-ing a destroyed handle.
If the shared osmesa lib is not found, then we assume that the
library has been compile-time linked via the linker.  This is to
accommodate build frameworks where compile-time linking is
preferred.
- use a Lanczos filter for sampling the color buffer, instead of a
  Blackman-Harris window. This improves sharpness quite a bit.

- some cleanups of the shader code

- never use YCoCg when rectification is not enabled.

- fix the calculation of the confidence parameter when upscaling is used.


Upscaling works a lot better now, but it is still work in progress.
…a pointer to non-trivially copyable type 'value_type' (aka 'filament::DescriptorSet::Desc') [-Werror,-Wnontrivial-memcall]

  152 |     memcpy(set.mDescriptors.data(), mDescriptors.data(), mDescriptors.size() * sizeof(Desc));
Fix 3 bugs wrt matdbg:

1. The numbering scheme for materials with same name was buggy
2. The shared depth variants of the default material should also
   show up as active material variants.
3. When two instantiations of FMaterial point to the exact same
   bits, we still need to treat them as two materials. (Otherwise
   we wouldn't be able to modify one of them in matdbg).
* Skeleton of fgviewer debug server

* Update the map for fg info

* Add ApiHandler skeleton class

* Format the files

Format the file

* Prevent fgviewer being built before ready

Add comment to fgviewer related lines

Fix cmake commands

* Update the copyright

* Address the comments

Address the comments

* Address the comments

* Specify the debug server

* Fix build options